
On the usage of statistics on the OH


eb45


WARNING: Major tl;dr potential here. If that's okay with you, please keep reading.

About two weeks ago, I made the following post here:

The idea of batting average being an important stat for player evaluation is like the idea of a geocentric universe to me. I'm not going to blame people for using it in the past, but now we know better, and to still rely on it is ignorant to say the least.

The amazing thing is, in many cases, given equal OBP and SLG, the player with the lower BA might be of more value to the team offensively. So with Reynolds' .320 and .433, it is actually better that he only hit .198 as opposed to .300.

Another illustration of that principle. Which player would you want nine clones of on your team, knowing nothing but these batting lines?

Player      AB    H    2B   3B   HR    BB    BA    OBP   SLG
Player A   560   190   22    1   10    40   .339  .383  .436
Player B   500   130   17    1   23   100   .260  .383  .436

If you said Player A, then my team will be better than yours. Based on linear weights, a team of nine Player Bs would score fifty more runs over the course of the season than a team of Player As.
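For anyone who wants to check the table, here is a minimal Python sketch that recomputes the three rate stats from the counting stats. It assumes no HBP or sacrifice flies (the table lists none), so OBP is simplified to (H + BB) / (AB + BB). The function name `slash_line` is made up for this example.

```python
def slash_line(ab, h, doubles, triples, hr, bb):
    """Return (BA, OBP, SLG) from counting stats.

    Assumption: no HBP or sacrifice flies, since the table lists none,
    so plate appearances are simply AB + BB.
    """
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    ba = h / ab
    obp = (h + bb) / (ab + bb)
    slg = total_bases / ab
    return ba, obp, slg


player_a = slash_line(ab=560, h=190, doubles=22, triples=1, hr=10, bb=40)
player_b = slash_line(ab=500, h=130, doubles=17, triples=1, hr=23, bb=100)

for name, line in (("Player A", player_a), ("Player B", player_b)):
    print(name, " ".join(f"{x:.3f}" for x in line))
```

Both lines come out with the same OBP and SLG, which is the whole point of the comparison: only the mix of hits and walks differs.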

The other wonderful thing about it is that BA is not as sustainable as OBP and SLG, therefore it is not particularly good as a predictive stat either.

Since 1968, there have been 77 players with seasons where their OBP and SLG values were within .010 of those in the example above. The career wOBA of the players with top ten batting averages of those 77 seasons (Player A) is .328. The bottom ten, Player B, is .365.

Now, if this were maybe .365 and .355, it would be statistically insignificant. However, the players who put up high BAs with good OBP and SLGs were mostly slightly above average hitters for their careers, indicating that their high BA performance was unsustainable, whereas players who relied more on the walks and extra base hits continued to perform at a high level.

In short, if I'm building a team for 2011, give me Mark Reynolds (2010 BA .198) over Omar Infante (2010 BA .319) any day of the week.

Read:

http://www.tangotiger.net/ops2.html

http://www.baseball-fever.com/showthread.php?57135-With-equal-OBP-and-SLG-Higher-BA-or-lower-BA

While I'm proud of that post, and a few people evidently liked it enough to rep me for it, I think most people on this board are beyond the point where they consider batting average to be a worthwhile stat for player evaluation. That's a good thing, because as I tried to demonstrate above, batting average is next to worthless, especially for predictive purposes.

However, I regularly see people on OH (for example, this Frobby thread) using OPS as essentially the new standard offensive stat, replacing what BA was for the longest time. And honestly, I'm not really okay with that.

Why? Well, first I'll copy the example from my post above.

Player      AB    H    2B   3B   HR    BB    BA    OBP   SLG
Player A   560   190   22    1   10    40   .339  .383  .436
Player B   500   130   17    1   23   100   .260  .383  .436

If you're using OPS, Players A and B are equal. That's better than what BA would suggest (that Player A is significantly better than Player B), but it's still not true. A team of nine Player Bs would score an estimated fifty runs more than a team of Player As. So OPS is clearly not the best evaluator of value.

The same argument about the predictive qualities applies. I won't rehash it here beyond noting that the players closest to the Player B line since 1969 are much better than the players closest to Player A, so OPS is not a particularly good predictive stat either.

Sidebar: I can't figure out anywhere else to work this into this post, but I feel it needs to be said. If you insist on using some form of OPS, please at least use the very easy league- and park-adjusted version, OPS+. Len Gabrielson's OPS in 1968 was .765. Neifi Perez's OPS in 2001 was .771. But there's absolutely no way Neifi was the better hitter, because he played in the most hitter-friendly environment ever, 2001 Coors Field, and Gabrielson played in the least hitter-friendly environment ever, 1968 Dodger Stadium. OPS+ adjusts raw OPS numbers for the environments players played in, and sets them relative to league average, which is a 100 OPS+. So Gabrielson's OPS+ was a 137 (very good) and Perez's was 81 (bad).
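For the curious, the commonly cited form of the calculation is 100 × (OBP/lgOBP + SLG/lgSLG − 1), with the league figures park-adjusted. Here is a rough Python sketch under that assumption. The inputs below are hypothetical environments, not Gabrielson's or Perez's actual numbers, and the park adjustment itself is not shown.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """OPS+ as commonly defined: 100 * (OBP/lgOBP + SLG/lgSLG - 1).

    lg_obp and lg_slg are assumed to be already park-adjusted here;
    the park adjustment itself is not shown.
    """
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical hitter with a .340 OBP and .430 SLG in two made-up environments.
low_offense = ops_plus(0.340, 0.430, lg_obp=0.300, lg_slg=0.340)
high_offense = ops_plus(0.340, 0.430, lg_obp=0.360, lg_slg=0.480)

print(round(low_offense), round(high_offense))
```

The same raw line lands well above 100 in the low-offense environment and below 100 in the high-offense one, which is the Gabrielson/Perez effect in miniature.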

But what should we use instead of OPS? Before I answer that question, I'd like to look at why OPS is kind of an okay measure of offensive performance.

For every (AB+PA), OPS applies the following values to each potential outcome:

BB/HBP: 1

1B: 2

2B: 3

3B: 4

HR: 5

The simplest way to figure this out is to actually calculate the values. If a player walks, his OBP for that plate appearance is 1.000 and his SLG is .000, for an OPS of 1.000. If he singles, that SLG becomes 1.000, and the OPS is 2.000. And so on and so forth.

EDIT (thanks skanar): These values only hold true for the first successful PA following any number of outs. Beyond that, the values can vary a bit. I apologise for my flawed math.

However, who decided a HR was five times as valuable as a walk? Or that a single was twice as valuable as a walk, and a HR was 2.5 times more valuable than a single? Does that sound right at all?

Well, it shouldn't.

As a batter, there are two aspects of your performance that can be measured: getting on base and moving runners over. OPS kind of does this, but crudely.

Here's where things get mathy. Based on empirical PBP data, people smarter than me created linear weights. The way it works, as best as I can gather, is as follows:

First, we have the potential of a runner on a certain base with a certain number of outs to score. A triple with two outs has less value than a triple with no outs in terms of scoring runs, because it is less likely that a player will score from third with two outs. We then compute the potential values for each potential situation. This number is the getting on base value.

Second, we have the amount an outcome at the plate affects the potential of runners on base to score. We again compute the values for each potential situation based on empirical PBP data. Next we multiply by the frequency of each event occurring, because while a HR can sometimes be worth four runs, it is much less common than one, two, or three. To reflect reality, we need to not assume each case has an equal possibility of occurrence.

After multiplying the "getting on" and "moving up" values of each event in each possible situation, we now have an empirical value for how much any outcome at the plate affects run scoring in any given situation. Tom Tango has compiled a chart of these based on a 1999-2002 run environment here.

The important thing in that chart is the bottom line, the average value for each event. It's not perfect, because the value of an event will always be dependent on context, but for a context-neutral large-scale evaluation (i.e. comparing players), it is as good an approximation as we can get. It's a whole lot better than the values assigned by OPS, that's for sure.
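To make the bookkeeping concrete, here is a toy Python sketch of one common way to do it: value each event as the change in run expectancy plus any runs that scored, then average over situations by how often each occurs. Every number below is invented for illustration and is not taken from Tango's chart; only the method is the point.

```python
# Toy run-expectancy table: expected runs for the rest of the inning,
# keyed by (runners, outs). All numbers are invented for illustration.
RUN_EXP = {
    ("empty", 0): 0.50, ("1st", 0): 0.90, ("2nd", 0): 1.10,
    ("empty", 2): 0.10, ("1st", 2): 0.25, ("2nd", 2): 0.33,
}


def run_value(before, after, runs_scored):
    """Value of one event: change in run expectancy plus runs that scored."""
    return RUN_EXP[after] - RUN_EXP[before] + runs_scored


# Hypothetical rows: (state before, state after, runs scored, share of all doubles)
doubles = [
    (("empty", 0), ("2nd", 0), 0, 0.5),
    (("empty", 2), ("2nd", 2), 0, 0.3),
    (("1st", 2), ("2nd", 2), 1, 0.2),
]

# Frequency-weighted average over the situations in which the event occurred.
avg_double = sum(run_value(b, a, r) * share for b, a, r, share in doubles)
print(f"average run value of a double (toy data): {avg_double:.3f}")
```

With real play-by-play data the table has every base/out state and the shares come from actual event counts, but the averaging step works the same way.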

So now that we've assigned these values based on outcomes from years and years of actual baseball being played at the major league level, how do we actually work them into a stat that we can easily access? Luckily, that's already been done for us. Fangraphs calls it weighted On-Base Average, or wOBA. It's scaled to league average OBP, so the league average OBP will be the same as the league average wOBA (usually in the .320-.335 range), so, like OPS+, it's relative to league, and not just raw numbers with no context.
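As a rough illustration of what goes into the number, here is a simplified wOBA sketch in Python. The weights are approximately those from Tango's original formulation; Fangraphs recomputes them every season, so treat them as illustrative rather than official. The sketch also ignores HBP, intentional walks, sacrifice flies and reaching on error, none of which appear in the Player A/B example.

```python
# Approximate weights from Tango's original wOBA; Fangraphs recomputes them
# each season, so treat these as illustrative rather than official.
WEIGHTS = {"bb": 0.72, "1b": 0.90, "2b": 1.24, "3b": 1.56, "hr": 1.95}


def woba(ab, h, doubles, triples, hr, bb):
    """Simplified wOBA: ignores HBP, IBB, SF and reaching on error."""
    singles = h - doubles - triples - hr
    numerator = (
        WEIGHTS["bb"] * bb
        + WEIGHTS["1b"] * singles
        + WEIGHTS["2b"] * doubles
        + WEIGHTS["3b"] * triples
        + WEIGHTS["hr"] * hr
    )
    return numerator / (ab + bb)


print(f"Player A: {woba(560, 190, 22, 1, 10, 40):.3f}")
print(f"Player B: {woba(500, 130, 17, 1, 23, 100):.3f}")
```

Unlike OPS, the weights here come from measured run values rather than from the accident of adding two fractions with different denominators.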

Fangraphs uses wOBA as the basis for their offensive stats, like their estimated batting runs, which are a major component in Wins Above Replacement (WAR), a statistic many here feel comfortable using. And that's perfectly fine, so long as you recognize what you're using it for and don't try to use it for something it's not meant to be used for.

Honestly, though, I don't like WAR. And here's why (this is where I diverge from accepted research and start to make my own assertions):

WAR measures value, not quality. The two are not interchangeable. A player is not better because he posts a higher WAR value. Apart from the general problems with fielding metrics, WAR is a very good estimator of how valuable a player was to a team. But that's all it is.

First, it's a counting stat. Counting stats are inherently inferior to rate stats in evaluating quality, because they can value playing more at a lower quality over playing less at a higher quality, when the player who played at the higher quality was by definition a better player. But unlike strictly batting or fielding stats, there is no way to properly convert WAR into a rate stat. Adjusting by PA conflicts with the fielding aspect, and in an extreme case could turn a player who played only as a defensive replacement and accumulated maybe .2 WAR while only coming to the plate once into a more valuable player than someone who accumulated 8 WAR in 500 PA (.2/1 > 8/500). And adjusting by innings played would have a similar effect on players who were solely pinch hitters. So it's a counting stat, and must remain a counting stat. This is okay for value, but not okay at all for quality.

Second, WAR includes a positional adjustment. In Fangraphs' explanation of the positional adjustment they use, they admit to making three assumptions that are okay to make if the goal is to measure value, but for measuring quality they cannot be made. The assumptions are:

1. Major league teams are being perfectly efficient with who they put, and where.

2. Left-handed players and right-handed players can each play every position.

3. Offensive ability is not independent of the position being played.

It has been demonstrated time and time again that the same offensive production from a left fielder is less valuable than from a shortstop. But does that necessarily make the shortstop a better player? No, it doesn't, because you're assuming that the left fielder is not a good enough player to play shortstop. Positional adjustments mean that WAR is inherently dependent on the way a team uses its players, which is when it entirely leaves the realm of measuring quality.

So all WAR is good for is measuring a player's value to his team. Which means it's great for stuff like deciding who should win the MVP award, but is that really what we want out of a stat?

Let me use an example here. The second-best (after Cliff Lee) starting pitcher in the AL in 2008 in my opinion was Justin Duchscherer. His ERA+ was 163 and his WHIP was 0.995. But his WAR (using the B-R formula) was 3.9. John Danks also had a great year, with an ERA+ of 138 and a WHIP of 1.226. Great numbers, but I don't think anyone would argue he was a better pitcher than Duchscherer based solely on quality stats. Yet Danks' WAR was 6.4, simply because he pitched more.

Yes, Danks was more valuable, but value is often inhibited by circumstances beyond the player's control, be it injury, a stubborn manager, or teams gaming the service clock. Evaluating quality is better because it removes context, thereby reaching a more pure outcome that tells us something definitive and unqualified.

In short, don't use OPS, use wOBA, but don't use WAR outside of its intended purpose. Use rate stats whenever possible. Above all, recognize what each stat seeks to measure, and never try to use it to measure something it cannot measure. Batting average does not accurately measure overall offensive quality. wOBA does that, but does not accurately measure overall player quality. There is no one stat that can do that. WAR measures value, but does not measure quality.

Basically, when making an argument, decide what you are arguing, select which stats best measure what you are arguing, and use them. By doing so, hopefully we can have discussions from which we can all learn, rather than simplistic throwing back and forth of numbers with little reason and no context.

Download the Fangraphs search engine for Firefox here.

I'm sorry this is so long.


This was a good and thorough explanation of some important misconceptions.

I am often guilty of the WAR confusion that you outlined. My first instinct is usually to think that the player with the higher WAR is the better player, without considering the difference between talent and value.

Regarding OPS and OPS+, I think a reason why OPS+ isn't used more is that we are all modern fans so we all know the OPS context within which to consider our players. There's still a very relevant difference (for example, I remember reading from someone on this board that Adam Jones' OPS+ was higher in '10 than '09, even though his OPS was lower), but the conversion rates are small enough that there can be a general consensus on what is a "good" OPS and what isn't, since the context of 2010 isn't hugely different from 2009 or 2008.


WARNING: Major tl;dr potential here. If that's okay with you, please keep reading.

But what should we use instead of OPS? Before I answer that question, I'd like to look at why OPS is kind of an okay measure of offensive performance.

For every (AB+PA), OPS applies the following values to each potential outcome:

BB/HBP: 1

1B: 2

2B: 3

3B: 4

HR: 5

The simplest way to figure this out is to actually calculate the values. If a player walks, his OBP for that plate appearance is 1.000 and his SLG is .000, for an OPS of 1.000. If he singles, that SLG becomes 1.000, and the OPS is 2.000. And so on and so forth.

However, who decided a HR was five times as valuable as a walk? Or that a single was twice as valuable as a walk, and a HR was 2.5 times more valuable than a single? Does that sound right at all?

First, nice post, and some valid points. But there's a mistake buried here: slugging percentage is calculated with At-Bats as the denominator, not Plate Appearances. So walks and HBPs just aren't a factor for slugging - they don't bring it down. That is, it's not accurate to say that a player's SLG for a plate appearance that results in a walk is .000.


Good point. I've always seen rate stats represented as .000 before any event occurred, but you're right, that's probably not mathematically correct.

Would it be better to say that with no slugging value, a player's OPS in a PA resulting in a walk is 1.000? I'm pretty terrible at math, so I don't know if you can actually calculate 0/0 (I don't think you can).


I agree wOBA is a much better measure than OPS. I even started a couple of threads where I swore I would use wOBA more often. (Like this one: http://forum.orioleshangout.com/forums/showthread.php?t=106755&highlight=wOBA) But, in many instances I just like using BB-ref better, and wOBA isn't available there. I'm guessing we'll use wOBA more and more often over the next few years, but I don't think the world will end if we use OPS as a readily available, easily calculable stat to estimate offensive prowess, despite its flaws, which you outlined correctly.


Let me use an example here. The second-best (after Cliff Lee) starting pitcher in the AL in 2008 in my opinion was Justin Duchscherer. His ERA+ was 163 and his WHIP was 0.995. But his WAR (using the B-R formula) was 3.9. John Danks also had a great year, with an ERA+ of 138 and a WHIP of 1.226. Great numbers, but I don't think anyone would argue he was a better pitcher than Duchscherer based solely on quality stats. Yet Danks' WAR was 6.4, simply because he pitched more.

Answering the bell is awfully important. Usually, players with the highest WAR are playing almost every day, or in the case of starting pitchers, making all their starts and throwing a lot of innings. And when they do that, they avoid the need to put in some lesser player in their place when they are injured or being rested. Plus, in some cases rate stats are distorted because the player is used in a way where he is put in easier situations. If Luke Scott rests 15 times a year against the toughest lefties in the league, whereas Nick Markakis faces them all, I can't fault Nick because his wOBA is a little lower than Luke's. (I'm not referring to 2010 in particular, just giving an illustration.)


tl;dr

Just kidding. But OPS is, like, orders of magnitude better than batting average at evaluating players. It's imperfect, but as far as quick-and-dirty stats get, it's incredible. A great blend of simplicity and effectiveness.


Good point. I've always seen rate stats represented as .000 before any event occurred, but you're right, that's probably not mathematically correct.

Would it be better to say that with no slugging value, a player's OPS in a PA resulting in a walk is 1.000? I'm pretty terrible at math, so I don't know if you can actually calculate 0/0 (I don't think you can).

Really, a player doesn't have an OPS until he has at least one at-bat. You can't calculate 0/0.

So, for a player with one walk, no hits, no at-bats, he has an OBP of 1.000 and a SLG of ----.

Perhaps a better way to compare would be to say that a player with a walk and a single in two plate appearances has an OBP of 1.000 and a SLG of 1.000. A player with two walks and a single (in 3 PAs) has an OBP of 1.000 and a SLG of 1.000. And a player with one walk and two singles has an OBP of 1.000 and a SLG of 1.000.

That is not to say that walks and singles are identically valued by OPS. In 10 PAs, 1 BB and 4 singles give an OPS of .944 (OBP=.500, SLG=.444), while 4 BBs and 1 single give an OPS of .667 (OBP=.500, SLG=.167). But your simple "a single is worth 2x a walk, a HR is worth 5x a walk" method is not correct.
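These figures are easy to check. Below is a small Python sketch for lines made up only of walks, singles and outs; the function name `ops_line` is just for this example. Walks count toward plate appearances but not at-bats, which is why SLG is undefined for a batter with nothing but walks.

```python
def ops_line(walks, singles, outs):
    """Return (OBP, SLG, OPS) for a line made only of walks, singles and outs.

    Walks count toward plate appearances but not at-bats, so SLG (and
    therefore OPS) is None when the batter has no at-bats.
    """
    at_bats = singles + outs
    plate_appearances = at_bats + walks
    obp = (walks + singles) / plate_appearances
    if at_bats == 0:
        return obp, None, None
    slg = singles / at_bats  # each single is one total base
    return obp, slg, obp + slg


print(ops_line(walks=1, singles=0, outs=0))  # one walk only: SLG undefined
print(ops_line(walks=1, singles=4, outs=5))  # 10 PA: 1 BB, 4 singles
print(ops_line(walks=4, singles=1, outs=5))  # 10 PA: 4 BB, 1 single
```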


Frobby: I don't disagree, but you're arguing value.

There's no real "right" way to evaluate players. wOBA is pretty much the best we have at the moment, and that only covers offense, and even then you still have to provide context (like the example of Scott sitting against tough lefties). Context is crucial in making judgments of any kind, and yes, it is entirely important to recognize that. We have a few stats that give general pictures, and there are so many more that provide context. We can't ever boil statistical analysis down to one or two super-stats, because it just doesn't work like that.

SrMeowMeow: I don't disagree at all. And until wOBA starts going on scoreboards and TV overlays, OPS will be the best available stat that you can calculate in your head. But in places like the OH, where we all have internet connections, wOBA is so much better and just as easily available.

skanar: How about this:

Assume the batter has already made one out, thereby creating a .000/.000/.000 line. A walk will make his OPS .500, a single 1.000, a double 1.500, a triple 2.000, and a home run 2.500, corresponding to the 1-5 values I listed above. Two previous outs would make the OPSs .333, .667, 1.000, 1.333, and 1.667. I would imagine that scale holds true for any number of previous outs made.
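A quick way to test that guess is to compute it with exact fractions. The Python sketch below assumes a batter with some number of hitless at-bats (at least one) followed by a single event; `ops_after` is a name invented for this example.

```python
from fractions import Fraction

TOTAL_BASES = {"BB": 0, "1B": 1, "2B": 2, "3B": 3, "HR": 4}


def ops_after(prior_outs, event):
    """OPS for a batter with `prior_outs` hitless at-bats followed by one event.

    Requires prior_outs >= 1; with no at-bats at all, SLG is undefined.
    """
    at_bats = prior_outs + (0 if event == "BB" else 1)
    plate_appearances = prior_outs + 1
    obp = Fraction(1, plate_appearances)
    slg = Fraction(TOTAL_BASES[event], at_bats)
    return obp + slg


for outs in (1, 2, 3):
    walk = ops_after(outs, "BB")
    # Express each event's OPS as a multiple of the walk's OPS.
    ratios = [ops_after(outs, e) / walk for e in TOTAL_BASES]
    print(outs, [str(r) for r in ratios])
```

The 1-2-3-4-5 ratio does hold for that first successful PA after any number of outs, because a walk gives 1/(n+1) and a hit worth b total bases gives (1+b)/(n+1). It stops holding once the line already contains walks or hits, since the AB and PA denominators then differ.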


I don't disagree, but you're arguing value.

There's no real "right" way to evaluate players. wOBA is pretty much the best we have at the moment, and that only covers offense, and even then you still have to provide context (like the example of Scott sitting against tough lefties). Context is crucial in making judgments of any kind, and yes, it is entirely important to recognize that. We have a few stats that give general pictures, and there are so many more that provide context. We can't ever boil statistical analysis down to one or two super-stats, because it just doesn't work like that.

Well, I mean, c'est la vie...that will never change :P


That was a great post and explanation of the new stats. I completely agree that people should use better stats to argue points, etc.

I'm in finance, so I understand the value of good math in understanding things. However I also understand the limitations.

For instance there is no financial metric for an uprising in Libya causing gas prices to go up because of speculation.

Just like there is no metric in baseball for Jose Bautista becoming a power hitter of his magnitude from last season.

Also, in your example of Player A vs. Player B, which player would you rather have up if you're going against Cliff Lee (doesn't give up many walks) and you need a hit in the ninth inning to win the game? A guy that relies heavily on walks to get on base, or a guy that reaches base at a 33% clip when he hits a fair ball?

I know you talk about trying to better account for each situation having its own context and different weight, but that's pretty big.

What I'm trying to get at is that while sometimes it's fun to play with numbers and make educated guesses or arguments, it's still not stat-o-matic out there. Even though Juan Uribe or Cody Ross might not be statistically superior to anyone on the Yankees or Red Sox, they got the job done when it counted.

Let's just keep in mind that we should observe on the field first, then use statistical evidence to support or refute those observations. Although a team of nine Player Bs may score more runs if projected into a stat-o-matic-style game where all pitchers are equal, there's probably a more efficient combination of, say, six Player Bs and three Player As, since there are obviously some situations where Player A's strengths are more suitable than Player B's.

I guess the beauty in baseball is that there are these teams of nine players who all have different skill sets, makeups, strengths, and weaknesses, and who are put together to create the best "team."

I think statistics can sometimes lead people to want only nine of one type of player. Not only is that highly unlikely, I think you need to have a good mix of strengths so that you can be dynamic enough to have success against many types of pitching in many types of situations. Obviously Cliff Lee dominated patient AL teams in the Yankees and Rays (both teams were 1 and 2 in BBs last season). However, the Giants beat Lee twice in the WS, and Lee had a 6.94 ERA against the team, which was 21st in the major leagues in walks.



Well, now you are really getting into sample-size and randomness and significance issues. Ross, for example, was generally the same or worse in clutch situations than his overall numbers both last season and for his career.

That's why the stats are important, because they can tell you the truth when your eyes are lying to you, just like your eyes can do the same when the stats lie.

And you are making what seems to be a fairly popular logical fallacy around baseball, which is equating a knowledge of and willingness to go to the stats to some sort of disassociation with the actual games themselves. That is not at all the case, yet it refuses to die the death it deserves.


SrMeowMeow: I don't disagree at all. And until wOBA starts going on scoreboards and TV overlays, OPS will be the best available stat that you can calculate in your head. But in places like the OH, where we all have internet connections, wOBA is so much better and just as easily available.

Even when wOBA is on scoreboards and TV overlays, you still won't be able to calculate it in your head. :P


Can I be honest?

Detailed statistical analysis is mostly beyond me. I failed Algebra the first time and didn't get much beyond that.

I watch sports mostly to unwind. I understand batting average, home runs, and RBIs. Those are what I grew up with, and they seem to be good enough indicators of player performance to give me a reasonably accurate thumbnail sketch of things -- as long as no one is going to quiz me later. ;) Some of these new stats are complicated and unfamiliar enough that it gives me a headache to think about them too long.

I also like on-base percentage, the one exception to my not liking new stats -- because I'm a walks guy. My on-base percentage was probably .900 in little league even though I couldn't hit a lick, because I had a sudden epiphany along the lines of "Hey, these guys couldn't throw the ball in the strike zone if it was tied to a string" and took *a lot* of pitches. :) I know it's a little different in the bigs, but watching all those swings at first pitches last year pained me. Watching a Markakis at-bat, on the other hand -- bliss. :)

Numbers can be fun, and I get why people like these new ways of looking at a very old game, but I hope there's still a place for fans like me who can't always get their heads around them. And, you know, something like "Home Run King" sounds a lot better coming off the tongue than "OBP King", you've got to admit. ;)





